N=1:

rank | frequency | n-gram |
---|---|---|
1 | 23263 | -s |
2 | 14191 | -e |
3 | 10777 | -n |
4 | 9770 | -d |
5 | 7822 | -y |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 6018 | -ed |
2 | 5722 | -ng |
3 | 5103 | -er |
4 | 4455 | -on |
5 | 4189 | -'s |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 5270 | -ing |
2 | 2157 | -ion |
3 | 1681 | -ers |
4 | 1327 | -ted |
5 | 955 | -ent |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1666 | -tion |
2 | 991 | -ting |
3 | 615 | -ions |
4 | 514 | -ated |
5 | 507 | -ment |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 1084 | -ation |
2 | 492 | -tions |
3 | 286 | -ating |
4 | 226 | -based |
5 | 226 | -ction |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:

    -- Rank the five most frequent three-letter word endings.
    SELECT @pos := (@pos + 1) AS pos, xx.*
    FROM (SELECT @pos := 0) r,
         (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram
          FROM words
          WHERE w_id > 100
          GROUP BY RIGHT(word, 3)
          ORDER BY cnt DESC) xx
    LIMIT 5;
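
The same counting can be sketched outside the database for any N. The following Python snippet is only an illustration under stated assumptions: the function name `top_endings` and the toy word list are hypothetical and not part of the corpus tooling, and ties are broken alphabetically, which the query above does not guarantee.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams as (rank, count, n-gram)."""
    # Like RIGHT(word, n) in the query above, w[-n:] yields the whole word
    # when the word is shorter than n characters.
    counts = Counter(w[-n:] for w in words)
    # Sort by descending count; ties are broken alphabetically (an assumption,
    # the SQL query leaves the order of ties unspecified).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, cnt, "-" + gram)
            for rank, (gram, cnt) in enumerate(ranked[:k], start=1)]


# Hypothetical toy word list, not taken from the corpus.
sample_words = ["nation", "station", "rating", "testing", "rated", "tested", "based"]

for row in top_endings(sample_words, 3, k=3):
    print(row)
```

Calling `top_endings(words, n)` for n from 1 to 5 on the full word list would yield tables with the same layout as those above, provided the word list is filtered the same way as in the query (`w_id > 100`).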

See also: 2.2.5 Most frequent word beginnings